
    Deep Learning Techniques for Music Generation -- A Survey

    This paper is a survey and analysis of different ways of using deep learning (deep artificial neural networks) to generate musical content. We propose a methodology based on five dimensions for our analysis:
    - Objective: What musical content is to be generated (e.g., melody, polyphony, accompaniment or counterpoint)? For what destination and use: to be performed by humans (a musical score) or by a machine (an audio file)?
    - Representation: What concepts are to be manipulated (e.g., waveform, spectrogram, note, chord, meter, beat)? In what format (e.g., MIDI, piano roll or text)? How will the representation be encoded (e.g., scalar, one-hot or many-hot)?
    - Architecture: What type(s) of deep neural network are to be used (e.g., feedforward network, recurrent network, autoencoder or generative adversarial network)?
    - Challenge: What are the limitations and open challenges (e.g., variability, interactivity and creativity)?
    - Strategy: How do we model and control the process of generation (e.g., single-step feedforward, iterative feedforward, sampling or input manipulation)?
    For each dimension, we conduct a comparative analysis of various models and techniques and propose a tentative multidimensional typology. This typology is bottom-up, based on the analysis of many existing deep-learning-based systems for music generation selected from the relevant literature. These systems are described and used to exemplify the various choices of objective, representation, architecture, challenge and strategy. The last section includes some discussion and prospects.
    Comment: 209 pages. This paper is a simplified version of the book: J.-P. Briot, G. Hadjeres and F.-D. Pachet, Deep Learning Techniques for Music Generation, Computational Synthesis and Creative Systems, Springer, 201
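
    Illustrative note (not from the survey): the Representation dimension above mentions piano-roll and one-hot encodings; the short sketch below, built on an arbitrary toy melody and time step, shows what such an encoding can look like in practice.

        import numpy as np

        # Hypothetical monophonic melody as (MIDI pitch, duration in time steps).
        melody = [(60, 2), (62, 2), (64, 4)]   # C4, D4, E4

        PITCH_RANGE = 128                      # full MIDI pitch range
        total_steps = sum(dur for _, dur in melody)

        # Piano-roll / one-hot encoding: one row per time step, one column per pitch.
        roll = np.zeros((total_steps, PITCH_RANGE), dtype=np.int8)
        t = 0
        for pitch, dur in melody:
            roll[t:t + dur, pitch] = 1         # mark the sounding pitch
            t += dur

        print(roll.shape)                      # (8, 128): 8 time steps, 128 pitches
        # A many-hot variant would set several pitch columns per row (polyphony).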

    Byte Pair Encoding for Symbolic Music

    When used with deep learning, the symbolic music modality is often coupled with language model architectures. To do so, the music needs to be tokenized, i.e. converted into a sequence of discrete tokens. This can be achieved by different approaches, as music can be composed of simultaneous tracks and of simultaneous notes with several attributes. Until now, the proposed tokenizations have relied on small vocabularies of tokens describing note attributes and time events, resulting in fairly long token sequences and a sub-optimal use of the embedding space of language models. Recent research has put effort into reducing the overall sequence length by merging embeddings or combining tokens. In this paper, we show that Byte Pair Encoding (BPE), a compression technique widely used for natural language, significantly decreases the sequence length while increasing the vocabulary size. By doing so, we leverage the embedding capabilities of such models with more expressive tokens, resulting in both better results and faster inference in generation and classification tasks. The source code is shared on GitHub, along with a companion website. Finally, BPE is directly implemented in MidiTok, allowing the reader to easily benefit from this method.
    Comment: EMNLP 2023, source code: https://github.com/Natooz/BPE-Symbolic-Musi
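
    Illustrative note (not the paper's or MidiTok's implementation): the toy sketch below shows the core BPE idea on invented note-attribute tokens, repeatedly merging the most frequent adjacent pair so that the sequence gets shorter while the vocabulary grows.

        from collections import Counter

        def most_frequent_pair(seq):
            """Return the most frequent adjacent token pair, or None."""
            pairs = Counter(zip(seq, seq[1:]))
            return pairs.most_common(1)[0][0] if pairs else None

        def merge_pair(seq, pair, new_token):
            """Replace every occurrence of `pair` by `new_token`."""
            out, i = [], 0
            while i < len(seq):
                if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                    out.append(new_token)
                    i += 2
                else:
                    out.append(seq[i])
                    i += 1
            return out

        # Invented note-attribute tokens (pitch, velocity, duration for each note).
        seq = ["Pitch_60", "Vel_90", "Dur_1.0",
               "Pitch_64", "Vel_90", "Dur_1.0",
               "Pitch_67", "Vel_90", "Dur_1.0"]
        vocab = set(seq)

        for _ in range(2):                     # two BPE merge steps
            pair = most_frequent_pair(seq)
            if pair is None:
                break
            new_token = "+".join(pair)         # the merged super-token
            vocab.add(new_token)
            seq = merge_pair(seq, pair, new_token)

        print(len(seq), len(vocab))            # sequence shrinks, vocabulary grows
        print(seq)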

    Collective management of environmental commons with multiple usages: a guaranteed viability approach

    In this paper we address the collective management of environmental commons with multiple usages in the framework of mathematical viability theory. We consider that the stakeholders can derive, from the study of their own socioeconomic problem, the variables describing their different usages of the commons and its evolution, as well as a representation of the desirable states for the commons. We then consider the guaranteed viability kernel, the subset of the set of desirable states from which it is possible to maintain the state of the commons even when its evolution is represented by several conflicting models. This approach is illustrated on a problem of lake eutrophication.
    Comment: 22 pages, 4 figures
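
    For context (a standard textbook-style formulation with assumed notation, not taken from the paper): the guaranteed viability kernel collects the initial states from which some control feedback keeps the commons within the set K of desirable states, whatever the admissible model of its evolution.

        % Assumed setting: controlled, uncertain dynamics
        %   x'(t) = f(x(t), u(t), v(t)),  u(t) in U (usage controls), v(t) in V (conflicting models),
        % with K the set of desirable states of the commons.
        \[
          \mathrm{Viab}^{\mathrm{guar}}(K) =
          \bigl\{\, x_0 \in K : \exists\, \text{feedback } u(\cdot)\ \text{such that } \forall\, v(\cdot),\
          \text{the state } x(\cdot) \text{ with } x(0) = x_0 \text{ remains in } K \text{ for all } t \ge 0 \,\bigr\}
        \]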

    miditok: A Python package for MIDI file tokenization

    Recent progress in natural language processing has been adapted to the symbolic music modality. Language models, such as Transformers, have been used with symbolic music for a variety of tasks, among which music generation, modeling and transcription, with state-of-the-art performance. These models are beginning to be used in production products. To encode and decode music for the backbone model, they need to rely on tokenizers, whose role is to serialize music into sequences of distinct elements called tokens. MidiTok is an open-source library for tokenizing symbolic music with great flexibility and extended features. It implements the most popular music tokenizations under a unified API, and is made to be easy to use and extensible for everyone.
    Comment: Updated and comprehensive report. Original ISMIR 2021 document at https://archives.ismir.net/ismir2021/latebreaking/000005.pd
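
    Illustrative sketch only: the snippet below shows a typical tokenize/detokenize round trip with MidiTok's REMI tokenization. The configuration options, the MIDI loading call and the exact signatures are assumptions that vary across MidiTok versions; the library documentation is the reference.

        from miditok import REMI, TokenizerConfig   # one of the tokenizations provided by MidiTok
        from symusic import Score                   # recent MidiTok versions load MIDI via symusic

        # Assumed configuration options; many more are available.
        config = TokenizerConfig(use_chords=True, use_tempos=True)
        tokenizer = REMI(config)

        midi = Score("path/to/file.mid")            # placeholder path
        tokens = tokenizer(midi)                    # MIDI -> token sequence(s)
        print(tokens)
        decoded = tokenizer(tokens)                 # token sequence(s) -> back to MIDI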

    Refinement operators to facilitate the reuse of interaction laws in open multi-agent systems

    As new software demands and requirements appear, the system and its interaction laws must evolve to support those changes. Languages and models should provide the tools for dealing with this evolution, and poor support for evolution has a negative impact on system maintainability. In this paper, we propose some refinement operators to extend the interaction laws in open multi-agent systems. As an example of this idea, we implemented a customizable application in the supply chain management domain as an open system environment.